Dataset statistics
| Number of variables | 10 |
|---|---|
| Number of observations | 3276 |
| Missing cells | 1434 |
| Missing cells (%) | 4.4% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 256.1 KiB |
| Average record size in memory | 80.0 B |
Variable types
| Numeric | 9 |
|---|---|
| Categorical | 1 |
Alerts
| Alert | Type |
|---|---|
| ph has 491 (15.0%) missing values | Missing |
| Sulfate has 781 (23.8%) missing values | Missing |
| Trihalomethanes has 162 (4.9%) missing values | Missing |
| Hardness has unique values | Unique |
| Solids has unique values | Unique |
| Chloramines has unique values | Unique |
| Conductivity has unique values | Unique |
| Organic_carbon has unique values | Unique |
| Turbidity has unique values | Unique |
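The overview figures can be cross-checked by hand. The following is a minimal sketch that uses only counts printed in this report (the variable names are illustrative): it sums the per-column missing counts and divides by the total number of cells.

```python
# Figures copied from the dataset statistics and the missing-value alerts.
n_rows = 3276
n_cols = 10
missing_per_column = {"ph": 491, "Sulfate": 781, "Trihalomethanes": 162}

missing_cells = sum(missing_per_column.values())
total_cells = n_rows * n_cols
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)            # total number of missing cells
print(f"{missing_pct:.1f}%")    # share of all cells that are missing
```

Both results agree with the "Missing cells" rows in the dataset statistics.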
Reproduction
| Analysis started | 2022-12-28 10:40:34.874242 |
|---|---|
| Analysis finished | 2022-12-28 10:40:58.883786 |
| Duration | 24.01 seconds |
| Software version | pandas-profiling vdev |
| Download configuration | config.json |
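As a small consistency check, the reported duration can be recomputed from the two timestamps. This sketch assumes the timestamps share one time zone, which the report does not state.

```python
from datetime import datetime

# Timestamp layout used in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2022-12-28 10:40:34.874242", FMT)
finished = datetime.strptime("2022-12-28 10:40:58.883786", FMT)

# Elapsed wall-clock time in seconds.
duration_s = (finished - started).total_seconds()
print(f"{duration_s:.2f} seconds")
```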
ph
Real number (ℝ)
| Distinct | 2785 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 491 |
| Missing (%) | 15.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.0807945 |
| Minimum | 0 |
| Maximum | 14 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 4.4879707 |
| Q1 | 6.0930919 |
| median | 7.0367521 |
| Q3 | 8.0620661 |
| 95-th percentile | 9.7898186 |
| Maximum | 14 |
| Range | 14 |
| Interquartile range (IQR) | 1.9689742 |
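The quartiles above are enough to apply a common rule of thumb for outliers. The sketch below assumes Tukey's 1.5 × IQR fences, which the report itself neither uses nor mentions. The quartiles and the ten smallest and ten largest ph values are copied from this section.

```python
# Quartiles copied from the ph quantile statistics.
q1, q3 = 6.0930919, 8.0620661
iqr = q3 - q1

# Assumption: Tukey's rule with a 1.5 multiplier (not part of the report).
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten smallest and ten largest ph values listed for this variable.
smallest = [0, 0.2274990502, 0.9755779898, 0.9899122129, 1.431781555,
            1.757037115, 1.844538366, 1.985383359, 2.128531434, 2.376768076]
largest = [14, 13.54124024, 13.34988856, 13.17540172, 12.24692807,
           11.90773983, 11.89807803, 11.62114013, 11.56876797, 11.56316906]

flagged_low = [v for v in smallest if v < lower_fence]
flagged_high = [v for v in largest if v > upper_fence]

print(f"IQR = {iqr:.7f}")
print(f"fences = [{lower_fence:.4f}, {upper_fence:.4f}]")
print(len(flagged_low), len(flagged_high))
```

Under this assumed rule all twenty listed extremes fall outside the fences. Whether they are measurement errors is a separate question that the report does not answer.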
Descriptive statistics
| Standard deviation | 1.5943195 |
|---|---|
| Coefficient of variation (CV) | 0.22516111 |
| Kurtosis | 0.72031558 |
| Mean | 7.0807945 |
| Median Absolute Deviation (MAD) | 0.984117 |
| Skewness | 0.025630448 |
| Sum | 19720.013 |
| Variance | 2.5418547 |
| Monotonicity | Not monotonic |
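Several of the descriptive statistics are derived from one another, so they can be checked against each other. The sketch below recomputes the mean, variance, and coefficient of variation for ph from the reported sum, standard deviation, and missing count. It assumes the mean is taken over non-missing observations only, which matches the reported figures to within rounding.

```python
# Values copied from the ph section of the report.
n_rows, n_missing = 3276, 491
total = 19720.013          # Sum
std = 1.5943195            # Standard deviation

n_valid = n_rows - n_missing   # observations that actually have a ph value
mean = total / n_valid
variance = std ** 2
cv = std / mean                # coefficient of variation

print(n_valid)
print(f"mean={mean:.4f} variance={variance:.4f} cv={cv:.4f}")
```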
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 8.55409697 | 1 | < 0.1% |
| 6.538084087 | 1 | < 0.1% |
| 5.91580675 | 1 | < 0.1% |
| 8.136497869 | 1 | < 0.1% |
| 6.493764175 | 1 | < 0.1% |
| 6.977405633 | 1 | < 0.1% |
| 5.489248055 | 1 | < 0.1% |
| 2.558102799 | 1 | < 0.1% |
| 7.312109304 | 1 | < 0.1% |
| 6.704431913 | 1 | < 0.1% |
| Other values (2775) | 2775 | |
| (Missing) | 491 | 15.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1 | |
| 0.2274990502 | 1 | |
| 0.9755779898 | 1 | |
| 0.9899122129 | 1 | |
| 1.431781555 | 1 | |
| 1.757037115 | 1 | |
| 1.844538366 | 1 | |
| 1.985383359 | 1 | |
| 2.128531434 | 1 | |
| 2.376768076 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 14 | 1 | |
| 13.54124024 | 1 | |
| 13.34988856 | 1 | |
| 13.17540172 | 1 | |
| 12.24692807 | 1 | |
| 11.90773983 | 1 | |
| 11.89807803 | 1 | |
| 11.62114013 | 1 | |
| 11.56876797 | 1 | |
| 11.56316906 | 1 |
Hardness
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 196.3695 |
| Minimum | 47.432 |
| Maximum | 323.124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 47.432 |
|---|---|
| 5-th percentile | 141.76328 |
| Q1 | 176.85054 |
| median | 196.96763 |
| Q3 | 216.66746 |
| 95-th percentile | 249.60977 |
| Maximum | 323.124 |
| Range | 275.692 |
| Interquartile range (IQR) | 39.816918 |
Descriptive statistics
| Standard deviation | 32.879761 |
|---|---|
| Coefficient of variation (CV) | 0.16743823 |
| Kurtosis | 0.61577168 |
| Mean | 196.3695 |
| Median Absolute Deviation (MAD) | 19.844989 |
| Skewness | -0.039341705 |
| Sum | 643306.47 |
| Variance | 1081.0787 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 204.8904555 | 1 | < 0.1% |
| 134.5602761 | 1 | < 0.1% |
| 170.1909123 | 1 | < 0.1% |
| 237.4610992 | 1 | < 0.1% |
| 171.2389255 | 1 | < 0.1% |
| 197.4281988 | 1 | < 0.1% |
| 195.7440741 | 1 | < 0.1% |
| 184.2318535 | 1 | < 0.1% |
| 187.8732835 | 1 | < 0.1% |
| 205.1505644 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 47.432 | 1 | |
| 73.49223369 | 1 | |
| 77.4595861 | 1 | |
| 81.71089527 | 1 | |
| 94.09130748 | 1 | |
| 94.81254522 | 1 | |
| 94.90897713 | 1 | |
| 97.2809086 | 1 | |
| 98.3679149 | 1 | |
| 98.45293051 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 323.124 | 1 | |
| 317.3381241 | 1 | |
| 311.3839565 | 1 | |
| 308.2538329 | 1 | |
| 307.7060241 | 1 | |
| 306.6274814 | 1 | |
| 304.2359121 | 1 | |
| 303.7026267 | 1 | |
| 300.2924758 | 1 | |
| 298.0986795 | 1 |
Solids
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 22014.093 |
| Minimum | 320.94261 |
| Maximum | 61227.196 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 320.94261 |
|---|---|
| 5-th percentile | 9545.8126 |
| Q1 | 15666.69 |
| median | 20927.834 |
| Q3 | 27332.762 |
| 95-th percentile | 38474.99 |
| Maximum | 61227.196 |
| Range | 60906.253 |
| Interquartile range (IQR) | 11666.072 |
Descriptive statistics
| Standard deviation | 8768.5708 |
|---|---|
| Coefficient of variation (CV) | 0.39831625 |
| Kurtosis | 0.44282609 |
| Mean | 22014.093 |
| Median Absolute Deviation (MAD) | 5809.4719 |
| Skewness | 0.62163449 |
| Sum | 72118167 |
| Variance | 76887834 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 20791.31898 | 1 | < 0.1% |
| 15979.33479 | 1 | < 0.1% |
| 37000.95567 | 1 | < 0.1% |
| 18736.1909 | 1 | < 0.1% |
| 12289.90092 | 1 | < 0.1% |
| 15979.06027 | 1 | < 0.1% |
| 12431.80311 | 1 | < 0.1% |
| 30031.83918 | 1 | < 0.1% |
| 29532.615 | 1 | < 0.1% |
| 19821.33837 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 320.9426113 | 1 | |
| 728.7508296 | 1 | |
| 1198.943699 | 1 | |
| 1351.906979 | 1 | |
| 1372.091043 | 1 | |
| 2552.962804 | 1 | |
| 2808.025756 | 1 | |
| 2835.303165 | 1 | |
| 2912.211247 | 1 | |
| 3413.081633 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 61227.19601 | 1 | |
| 56867.85924 | 1 | |
| 56488.67241 | 1 | |
| 56351.3963 | 1 | |
| 56320.58698 | 1 | |
| 55334.7028 | 1 | |
| 53735.89919 | 1 | |
| 52318.9173 | 1 | |
| 52060.2268 | 1 | |
| 51731.82055 | 1 |
Chloramines
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.1222768 |
| Minimum | 0.352 |
| Maximum | 13.127 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 0.352 |
|---|---|
| 5-th percentile | 4.5030537 |
| Q1 | 6.1274208 |
| median | 7.130299 |
| Q3 | 8.114887 |
| 95-th percentile | 9.7531005 |
| Maximum | 13.127 |
| Range | 12.775 |
| Interquartile range (IQR) | 1.9874663 |
Descriptive statistics
| Standard deviation | 1.5830849 |
|---|---|
| Coefficient of variation (CV) | 0.22227231 |
| Kurtosis | 0.58990117 |
| Mean | 7.1222768 |
| Median Absolute Deviation (MAD) | 0.99166134 |
| Skewness | -0.01209844 |
| Sum | 23332.579 |
| Variance | 2.5061578 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 7.300211873 | 1 | < 0.1% |
| 9.504361027 | 1 | < 0.1% |
| 6.217222542 | 1 | < 0.1% |
| 5.599870342 | 1 | < 0.1% |
| 10.78649982 | 1 | < 0.1% |
| 7.424944591 | 1 | < 0.1% |
| 6.6616162 | 1 | < 0.1% |
| 6.21530731 | 1 | < 0.1% |
| 7.981036899 | 1 | < 0.1% |
| 6.344963412 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.352 | 1 | |
| 0.5303512947 | 1 | |
| 1.390870905 | 1 | |
| 1.683992581 | 1 | |
| 1.920271449 | 1 | |
| 2.102690991 | 1 | |
| 2.386653494 | 1 | |
| 2.39798499 | 1 | |
| 2.456013596 | 1 | |
| 2.458609195 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 13.127 | 1 | |
| 13.04380611 | 1 | |
| 12.91218664 | 1 | |
| 12.65336202 | 1 | |
| 12.62689974 | 1 | |
| 12.58002649 | 1 | |
| 12.36328483 | 1 | |
| 12.27937418 | 1 | |
| 12.2463941 | 1 | |
| 12.22717528 | 1 |
Sulfate
Real number (ℝ)
| Distinct | 2495 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 781 |
| Missing (%) | 23.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 333.77578 |
| Minimum | 129 |
| Maximum | 481.03064 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 129 |
|---|---|
| 5-th percentile | 266.61623 |
| Q1 | 307.6995 |
| median | 333.07355 |
| Q3 | 359.95017 |
| 95-th percentile | 403.07019 |
| Maximum | 481.03064 |
| Range | 352.03064 |
| Interquartile range (IQR) | 52.250673 |
Descriptive statistics
| Standard deviation | 41.41684 |
|---|---|
| Coefficient of variation (CV) | 0.12408582 |
| Kurtosis | 0.64826281 |
| Mean | 333.77578 |
| Median Absolute Deviation (MAD) | 26.095176 |
| Skewness | -0.035946622 |
| Sum | 832770.56 |
| Variance | 1715.3547 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 280.7456229 | 1 | < 0.1% |
| 332.7445192 | 1 | < 0.1% |
| 391.9182286 | 1 | < 0.1% |
| 330.9053704 | 1 | < 0.1% |
| 402.3134271 | 1 | < 0.1% |
| 360.6978151 | 1 | < 0.1% |
| 336.0404518 | 1 | < 0.1% |
| 405.5273372 | 1 | < 0.1% |
| 346.0636768 | 1 | < 0.1% |
| 368.5164413 | 1 | < 0.1% |
| Other values (2485) | 2485 | |
| (Missing) | 781 | 23.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 129 | 1 | |
| 180.2067464 | 1 | |
| 182.3973702 | 1 | |
| 187.1707144 | 1 | |
| 187.4241309 | 1 | |
| 192.0335917 | 1 | |
| 203.4445208 | 1 | |
| 205.9350906 | 1 | |
| 206.2472294 | 1 | |
| 207.8904823 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 481.0306423 | 1 | |
| 476.5397173 | 1 | |
| 475.7374602 | 1 | |
| 462.474215 | 1 | |
| 460.107069 | 1 | |
| 458.4410723 | 1 | |
| 455.4512337 | 1 | |
| 450.9144544 | 1 | |
| 449.2676875 | 1 | |
| 447.4179624 | 1 |
Conductivity
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 426.20511 |
| Minimum | 181.48375 |
| Maximum | 753.34262 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 181.48375 |
|---|---|
| 5-th percentile | 300.10947 |
| Q1 | 365.73441 |
| median | 421.88497 |
| Q3 | 481.7923 |
| 95-th percentile | 566.34932 |
| Maximum | 753.34262 |
| Range | 571.85887 |
| Interquartile range (IQR) | 116.05789 |
Descriptive statistics
| Standard deviation | 80.824064 |
|---|---|
| Coefficient of variation (CV) | 0.18963654 |
| Kurtosis | -0.27709283 |
| Mean | 426.20511 |
| Median Absolute Deviation (MAD) | 57.887591 |
| Skewness | 0.26449022 |
| Sum | 1396247.9 |
| Variance | 6532.5293 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 564.3086542 | 1 | < 0.1% |
| 418.6420628 | 1 | < 0.1% |
| 517.5767619 | 1 | < 0.1% |
| 235.0422835 | 1 | < 0.1% |
| 501.5597252 | 1 | < 0.1% |
| 452.1872326 | 1 | < 0.1% |
| 367.8540248 | 1 | < 0.1% |
| 400.6118991 | 1 | < 0.1% |
| 469.1321169 | 1 | < 0.1% |
| 482.5957093 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 181.483754 | 1 | |
| 201.6197368 | 1 | |
| 210.319182 | 1 | |
| 217.3583296 | 1 | |
| 232.613624 | 1 | |
| 233.9079651 | 1 | |
| 235.0422835 | 1 | |
| 245.859632 | 1 | |
| 247.9180305 | 1 | |
| 251.0208987 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 753.3426196 | 1 | |
| 708.2263645 | 1 | |
| 695.369528 | 1 | |
| 674.4434759 | 1 | |
| 672.5569992 | 1 | |
| 669.7250862 | 1 | |
| 666.6906183 | 1 | |
| 660.2549463 | 1 | |
| 657.5704218 | 1 | |
| 656.9241278 | 1 |
Organic_carbon
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14.28497 |
| Minimum | 2.2 |
| Maximum | 28.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 2.2 |
|---|---|
| 5-th percentile | 8.8153617 |
| Q1 | 12.065801 |
| median | 14.218338 |
| Q3 | 16.557652 |
| 95-th percentile | 19.637254 |
| Maximum | 28.3 |
| Range | 26.1 |
| Interquartile range (IQR) | 4.4918502 |
Descriptive statistics
| Standard deviation | 3.308162 |
|---|---|
| Coefficient of variation (CV) | 0.2315834 |
| Kurtosis | 0.044409307 |
| Mean | 14.28497 |
| Median Absolute Deviation (MAD) | 2.2322941 |
| Skewness | 0.025532582 |
| Sum | 46797.563 |
| Variance | 10.943936 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 10.37978308 | 1 | < 0.1% |
| 12.89763545 | 1 | < 0.1% |
| 15.87176979 | 1 | < 0.1% |
| 11.545477 | 1 | < 0.1% |
| 12.28433352 | 1 | < 0.1% |
| 18.58495937 | 1 | < 0.1% |
| 21.30064694 | 1 | < 0.1% |
| 15.28878163 | 1 | < 0.1% |
| 16.1692117 | 1 | < 0.1% |
| 12.16473568 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.2 | 1 | |
| 4.371898608 | 1 | |
| 4.466771969 | 1 | |
| 4.473092264 | 1 | |
| 4.861631498 | 1 | |
| 4.902888068 | 1 | |
| 4.966861619 | 1 | |
| 5.051694615 | 1 | |
| 5.159380308 | 1 | |
| 5.188466455 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 28.3 | 1 | |
| 27.00670661 | 1 | |
| 24.75539237 | 1 | |
| 23.95245044 | 1 | |
| 23.91760126 | 1 | |
| 23.66766678 | 1 | |
| 23.60429797 | 1 | |
| 23.56964491 | 1 | |
| 23.51477377 | 1 | |
| 23.39951606 | 1 |
Trihalomethanes
Real number (ℝ)
| Distinct | 3114 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 162 |
| Missing (%) | 4.9% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66.396293 |
| Minimum | 0.738 |
| Maximum | 124 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 0.738 |
|---|---|
| 5-th percentile | 39.552928 |
| Q1 | 55.844536 |
| median | 66.622485 |
| Q3 | 77.337473 |
| 95-th percentile | 92.124059 |
| Maximum | 124 |
| Range | 123.262 |
| Interquartile range (IQR) | 21.492937 |
Descriptive statistics
| Standard deviation | 16.175008 |
|---|---|
| Coefficient of variation (CV) | 0.24361313 |
| Kurtosis | 0.23859744 |
| Mean | 66.396293 |
| Median Absolute Deviation (MAD) | 10.742172 |
| Skewness | -0.083030674 |
| Sum | 206758.06 |
| Variance | 261.6309 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 86.99097046 | 1 | < 0.1% |
| 56.71550955 | 1 | < 0.1% |
| 77.73081437 | 1 | < 0.1% |
| 90.39489472 | 1 | < 0.1% |
| 37.78709664 | 1 | < 0.1% |
| 78.9255271 | 1 | < 0.1% |
| 89.47771837 | 1 | < 0.1% |
| 69.526718 | 1 | < 0.1% |
| 72.57395938 | 1 | < 0.1% |
| 57.78086932 | 1 | < 0.1% |
| Other values (3104) | 3104 | |
| (Missing) | 162 | 4.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.738 | 1 | |
| 8.175876384 | 1 | |
| 8.577012933 | 1 | |
| 14.34316145 | 1 | |
| 15.6848768 | 1 | |
| 16.2915046 | 1 | |
| 17.00068293 | 1 | |
| 17.52776496 | 1 | |
| 17.91572257 | 1 | |
| 18.01527236 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 124 | 1 | |
| 120.030077 | 1 | |
| 118.3572747 | 1 | |
| 116.1616216 | 1 | |
| 114.2086714 | 1 | |
| 114.0349457 | 1 | |
| 113.0488857 | 1 | |
| 112.622733 | 1 | |
| 112.4122104 | 1 | |
| 112.0610274 | 1 |
Turbidity
Real number (ℝ)
| Distinct | 3276 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.9667862 |
| Minimum | 1.45 |
| Maximum | 6.739 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 25.7 KiB |
Quantile statistics
| Minimum | 1.45 |
|---|---|
| 5-th percentile | 2.6842792 |
| Q1 | 3.4397109 |
| median | 3.9550276 |
| Q3 | 4.5003198 |
| 95-th percentile | 5.2209245 |
| Maximum | 6.739 |
| Range | 5.289 |
| Interquartile range (IQR) | 1.0606089 |
Descriptive statistics
| Standard deviation | 0.78038241 |
|---|---|
| Coefficient of variation (CV) | 0.19672913 |
| Kurtosis | -0.062800641 |
| Mean | 3.9667862 |
| Median Absolute Deviation (MAD) | 0.53029624 |
| Skewness | -0.0078166424 |
| Sum | 12995.191 |
| Variance | 0.6089967 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.963135381 | 1 | < 0.1% |
| 3.987012091 | 1 | < 0.1% |
| 4.066229364 | 1 | < 0.1% |
| 3.759326201 | 1 | < 0.1% |
| 4.876273 | 1 | < 0.1% |
| 5.143750122 | 1 | < 0.1% |
| 4.513200539 | 1 | < 0.1% |
| 4.20418585 | 1 | < 0.1% |
| 4.586748359 | 1 | < 0.1% |
| 4.910911021 | 1 | < 0.1% |
| Other values (3266) | 3266 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.45 | 1 | |
| 1.492206615 | 1 | |
| 1.496100943 | 1 | |
| 1.64151501 | 1 | |
| 1.659799385 | 1 | |
| 1.680554025 | 1 | |
| 1.687624505 | 1 | |
| 1.801326999 | 1 | |
| 1.81252894 | 1 | |
| 1.844371604 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 6.739 | 1 | |
| 6.494748556 | 1 | |
| 6.494249467 | 1 | |
| 6.389161009 | 1 | |
| 6.35743852 | 1 | |
| 6.307678472 | 1 | |
| 6.226580405 | 1 | |
| 6.204846359 | 1 | |
| 6.099631873 | 1 | |
| 6.083772354 | 1 |
Potability
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 25.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 3276 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
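The two counts above determine the class balance of the target. A minimal sketch, using only the counts from this table, computes each class's share and the accuracy of a hypothetical baseline that always predicts the majority class. The baseline is an illustration and is not part of the report.

```python
# Potability counts copied from the Common Values table.
counts = {0: 1998, 1: 1278}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

# Hypothetical baseline: always predict the most frequent class.
majority_label = max(counts, key=counts.get)
baseline_accuracy = counts[majority_label] / total

for label, share in shares.items():
    print(f"class {label}: {share:.1f}%")
print(f"majority class: {majority_label}")
```

Any classifier trained on this data should be judged against that majority-class baseline rather than against 50%.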
Length
Histogram of lengths of the category
Common Values (Plot)
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
Most occurring characters
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
Most occurring categories
| Value | Count | Frequency (%) |
|---|---|---|
| Decimal Number | 3276 | 100.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
|---|---|---|
| Common | 3276 | 100.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
|---|---|---|
| ASCII | 3276 | 100.0% |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1998 | 61.0% |
| 1 | 1278 | 39.0% |
| | ph | Hardness | Solids | Chloramines | Sulfate | Conductivity | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|
| ph | 1.000 | 0.116 | -0.075 | -0.042 | 0.024 | 0.017 | 0.044 | 0.005 | -0.049 | 0.084 |
| Hardness | 0.116 | 1.000 | -0.053 | -0.025 | -0.095 | -0.033 | 0.003 | -0.012 | -0.013 | 0.079 |
| Solids | -0.075 | -0.053 | 1.000 | -0.055 | -0.154 | 0.021 | 0.018 | -0.020 | 0.028 | 0.025 |
| Chloramines | -0.042 | -0.025 | -0.055 | 1.000 | 0.037 | -0.017 | -0.012 | 0.018 | -0.008 | 0.077 |
| Sulfate | 0.024 | -0.095 | -0.154 | 0.037 | 1.000 | -0.022 | 0.020 | -0.031 | -0.019 | 0.151 |
| Conductivity | 0.017 | -0.033 | 0.021 | -0.017 | -0.022 | 1.000 | 0.021 | -0.004 | 0.010 | 0.000 |
| Organic_carbon | 0.044 | 0.003 | 0.018 | -0.012 | 0.020 | 0.021 | 1.000 | -0.008 | -0.025 | 0.015 |
| Trihalomethanes | 0.005 | -0.012 | -0.020 | 0.018 | -0.031 | -0.004 | -0.008 | 1.000 | -0.028 | 0.000 |
| Turbidity | -0.049 | -0.013 | 0.028 | -0.008 | -0.019 | 0.010 | -0.025 | -0.028 | 1.000 | 0.000 |
| Potability | 0.084 | 0.079 | 0.025 | 0.077 | 0.151 | 0.000 | 0.015 | 0.000 | 0.000 | 1.000 |
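A matrix of this size is easier to read when the off-diagonal entries are ranked by magnitude. The sketch below copies the values from the table above, checks that the matrix is symmetric, and lists the three strongest pairs. The report does not state which correlation coefficient was used, so the code makes no assumption about that.

```python
names = ["ph", "Hardness", "Solids", "Chloramines", "Sulfate",
         "Conductivity", "Organic_carbon", "Trihalomethanes",
         "Turbidity", "Potability"]

# Rows copied from the correlation table, in the same order as `names`.
matrix = [
    [1.000, 0.116, -0.075, -0.042, 0.024, 0.017, 0.044, 0.005, -0.049, 0.084],
    [0.116, 1.000, -0.053, -0.025, -0.095, -0.033, 0.003, -0.012, -0.013, 0.079],
    [-0.075, -0.053, 1.000, -0.055, -0.154, 0.021, 0.018, -0.020, 0.028, 0.025],
    [-0.042, -0.025, -0.055, 1.000, 0.037, -0.017, -0.012, 0.018, -0.008, 0.077],
    [0.024, -0.095, -0.154, 0.037, 1.000, -0.022, 0.020, -0.031, -0.019, 0.151],
    [0.017, -0.033, 0.021, -0.017, -0.022, 1.000, 0.021, -0.004, 0.010, 0.000],
    [0.044, 0.003, 0.018, -0.012, 0.020, 0.021, 1.000, -0.008, -0.025, 0.015],
    [0.005, -0.012, -0.020, 0.018, -0.031, -0.004, -0.008, 1.000, -0.028, 0.000],
    [-0.049, -0.013, 0.028, -0.008, -0.019, 0.010, -0.025, -0.028, 1.000, 0.000],
    [0.084, 0.079, 0.025, 0.077, 0.151, 0.000, 0.015, 0.000, 0.000, 1.000],
]
n = len(names)

# Sanity check: a correlation matrix should equal its transpose.
is_symmetric = all(matrix[i][j] == matrix[j][i]
                   for i in range(n) for j in range(n))

# Collect each unordered pair once (upper triangle) and rank by |r|.
pairs = [(names[i], names[j], matrix[i][j])
         for i in range(n) for j in range(i + 1, n)]
pairs.sort(key=lambda p: abs(p[2]), reverse=True)

print("symmetric:", is_symmetric)
for a, b, r in pairs[:3]:
    print(f"{a} / {b}: {r:+.3f}")
```

Even the strongest pair has an absolute correlation below 0.2, so no two variables in this table are close to redundant.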
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
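Nullity correlation can be sketched as an ordinary Pearson correlation between 0/1 "is missing" indicators. The example below builds those indicators from only the twenty sample rows printed in this report (indices 0 to 9 and 3266 to 3275), so its output illustrates the idea and does not reproduce the heatmap. The helper `pearson` is a hypothetical name, not a library function.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# 1 = value is NaN in the sample rows; order: rows 0..9, then 3266..3275.
ph_missing      = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0] + [0] * 10
sulfate_missing = [0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
                   1, 0, 0, 0, 0, 0, 1, 1, 1, 1]
thm_missing     = [0] * 16 + [1] + [0] * 3   # only row 3272 lacks Trihalomethanes

r_ph_sulfate = pearson(ph_missing, sulfate_missing)
r_sulfate_thm = pearson(sulfate_missing, thm_missing)

print(f"ph vs Sulfate nullity: {r_ph_sulfate:+.3f}")
print(f"Sulfate vs Trihalomethanes nullity: {r_sulfate_thm:+.3f}")
```

A positive value means the two columns tend to be missing in the same rows, and a negative value means one tends to be present when the other is missing. With only twenty rows the magnitudes are not meaningful.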
| | ph | Hardness | Solids | Chloramines | Sulfate | Conductivity | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | NaN | 204.890455 | 20791.318981 | 7.300212 | 368.516441 | 564.308654 | 10.379783 | 86.990970 | 2.963135 | 0 |
| 1 | 3.716080 | 129.422921 | 18630.057858 | 6.635246 | NaN | 592.885359 | 15.180013 | 56.329076 | 4.500656 | 0 |
| 2 | 8.099124 | 224.236259 | 19909.541732 | 9.275884 | NaN | 418.606213 | 16.868637 | 66.420093 | 3.055934 | 0 |
| 3 | 8.316766 | 214.373394 | 22018.417441 | 8.059332 | 356.886136 | 363.266516 | 18.436524 | 100.341674 | 4.628771 | 0 |
| 4 | 9.092223 | 181.101509 | 17978.986339 | 6.546600 | 310.135738 | 398.410813 | 11.558279 | 31.997993 | 4.075075 | 0 |
| 5 | 5.584087 | 188.313324 | 28748.687739 | 7.544869 | 326.678363 | 280.467916 | 8.399735 | 54.917862 | 2.559708 | 0 |
| 6 | 10.223862 | 248.071735 | 28749.716544 | 7.513408 | 393.663396 | 283.651634 | 13.789695 | 84.603556 | 2.672989 | 0 |
| 7 | 8.635849 | 203.361523 | 13672.091764 | 4.563009 | 303.309771 | 474.607645 | 12.363817 | 62.798309 | 4.401425 | 0 |
| 8 | NaN | 118.988579 | 14285.583854 | 7.804174 | 268.646941 | 389.375566 | 12.706049 | 53.928846 | 3.595017 | 0 |
| 9 | 11.180284 | 227.231469 | 25484.508491 | 9.077200 | 404.041635 | 563.885481 | 17.927806 | 71.976601 | 4.370562 | 0 |
| | ph | Hardness | Solids | Chloramines | Sulfate | Conductivity | Organic_carbon | Trihalomethanes | Turbidity | Potability |
|---|---|---|---|---|---|---|---|---|---|---|
| 3266 | 8.372910 | 169.087052 | 14622.745494 | 7.547984 | NaN | 464.525552 | 11.083027 | 38.435151 | 4.906358 | 1 |
| 3267 | 8.989900 | 215.047358 | 15921.412018 | 6.297312 | 312.931022 | 390.410231 | 9.899115 | 55.069304 | 4.613843 | 1 |
| 3268 | 6.702547 | 207.321086 | 17246.920347 | 7.708117 | 304.510230 | 329.266002 | 16.217303 | 28.878601 | 3.442983 | 1 |
| 3269 | 11.491011 | 94.812545 | 37188.826022 | 9.263166 | 258.930600 | 439.893618 | 16.172755 | 41.558501 | 4.369264 | 1 |
| 3270 | 6.069616 | 186.659040 | 26138.780191 | 7.747547 | 345.700257 | 415.886955 | 12.067620 | 60.419921 | 3.669712 | 1 |
| 3271 | 4.668102 | 193.681735 | 47580.991603 | 7.166639 | 359.948574 | 526.424171 | 13.894419 | 66.687695 | 4.435821 | 1 |
| 3272 | 7.808856 | 193.553212 | 17329.802160 | 8.061362 | NaN | 392.449580 | 19.903225 | NaN | 2.798243 | 1 |
| 3273 | 9.419510 | 175.762646 | 33155.578218 | 7.350233 | NaN | 432.044783 | 11.039070 | 69.845400 | 3.298875 | 1 |
| 3274 | 5.126763 | 230.603758 | 11983.869376 | 6.303357 | NaN | 402.883113 | 11.168946 | 77.488213 | 4.708658 | 1 |
| 3275 | 7.874671 | 195.102299 | 17404.177061 | 7.509306 | NaN | 327.459760 | 16.140368 | 78.698446 | 2.309149 | 1 |